A Coverage-based Approach to Nondiscrimination-aware Data Transformation

Abstract

The development of technological solutions satisfying nondiscriminatory requirements is one of the main current challenges for data processing. Back-end operators for preparing data, i.e., extracting and transforming it, play a relevant role w.r.t. nondiscrimination, since they can introduce bias with an impact on the entire data life-cycle. In this article, we focus on back-end transformations, defined in terms of Select-Project-Join queries, and on coverage. Coverage aims at guaranteeing that the input, or training, dataset includes enough examples for each (protected) category of interest, thus increasing diversity with the aim of limiting the introduction of bias during the next analytical steps. The article proposes an approach to automatically rewrite a transformation whose result violates coverage constraints into the “closest” query satisfying those constraints. The approach is approximate and relies on sample-based cardinality estimation; it therefore introduces a trade-off between accuracy and efficiency. The efficiency and effectiveness of the approach are experimentally validated on synthetic and real data.
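To make the notions of coverage constraint and sample-based cardinality estimation concrete, here is a minimal Python sketch. It is not the article's algorithm: the toy data, the function names (`coverage_counts`, `violated_constraints`, `estimate_counts`, `relax_threshold`), and the naive one-step-at-a-time relaxation of a single selection threshold are all assumptions made for illustration only.

```python
import random
from collections import Counter


def coverage_counts(rows, predicate, protected_attr):
    """Count the rows selected by `predicate` for each value of `protected_attr`."""
    return Counter(r[protected_attr] for r in rows if predicate(r))


def violated_constraints(counts, min_counts):
    """Return the categories whose count falls below the required minimum."""
    return {c: k for c, k in min_counts.items() if counts.get(c, 0) < k}


def estimate_counts(rows, predicate, protected_attr, sample_size, seed=0):
    """Estimate per-category counts from a uniform sample, scaled to the full size."""
    rng = random.Random(seed)
    sample = rng.sample(rows, sample_size)
    scale = len(rows) / sample_size
    sampled = coverage_counts(sample, predicate, protected_attr)
    return {c: n * scale for c, n in sampled.items()}


def relax_threshold(rows, attr, start, protected_attr, min_counts):
    """Hypothetical rewriting step: lower an `attr >= t` threshold one unit
    at a time and return the first (closest) t that satisfies coverage."""
    for t in range(start, -1, -1):
        counts = coverage_counts(rows, lambda r, t=t: r[attr] >= t, protected_attr)
        if not violated_constraints(counts, min_counts):
            return t
    return None


# Toy data, made up for illustration: 1000 rows with a protected attribute.
rows = [{"gender": "F" if i % 4 == 0 else "M", "income": i % 100}
        for i in range(1000)]
min_counts = {"F": 50, "M": 50}  # assumed coverage constraints

# Original selection: income >= 90. It returns 20 "F" rows and 80 "M" rows.
exact = coverage_counts(rows, lambda r: r["income"] >= 90, "gender")
violations = violated_constraints(exact, min_counts)  # {"F": 50}

# Closest relaxed threshold that satisfies both constraints (80 here).
closest = relax_threshold(rows, "income", 90, "gender", min_counts)

# Cheaper, approximate check on a 10% sample instead of the full data.
approx = estimate_counts(rows, lambda r: r["income"] >= 90, "gender", 100)
```

The smaller the sample passed to `estimate_counts`, the cheaper each check and the noisier the estimate, which is the accuracy versus efficiency trade-off the abstract refers to.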

Journal

Journal title: Journal of Data and Information Quality

Year: 2022

ISSN: 1936-1963, 1936-1955

DOI: https://doi.org/10.1145/3546913